#compositional generalization
01/07/2025
OMEGA Benchmark: Testing the Creative Limits of AI in Math Reasoning
OMEGA is a novel benchmark designed to probe the reasoning limits of large language models in mathematics, focusing on exploratory, compositional, and transformational generalization.